Image‐level dataset synthesis with an end‐to‐end trainable framework

Abstract

Dataset synthesis via virtual engines such as Unity has attracted growing attention in recent years because ground-truth labels can be obtained at low cost. In this kind of work, environments are constructed within the engine to mimic the real world, either with great manual effort or with learning-based methods. The latter are superior to the former when the target real-world scenes are changeable, because attributes can be adjusted automatically based on the distribution difference between synthetic and real-world datasets. However, the non-differentiability of the whole pipeline hinders the efficiency of attribute optimization. To this end, the paper proposes to simulate datasets from a fine-grained perspective, such that the system can be trained in an end-to-end manner. Specifically, dataset synthesis is converted into an image-level data synthesis problem, and a constraint is designed using the content loss between two images. As the rendering process is mathematically unknown, which blocks the back propagation of gradients, a generative model is used to approximate the engine. As a result, the framework becomes fully differentiable and can be optimized efficiently by gradient descent. Experimental results show that the method is useful for training. Besides, it is found that the method enables learning potential attributes of the data, which is hard to achieve with existing methods. As far as we know, this is the first attempt to finish the task in an end-to-end training process.
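The core idea in the abstract (replace a non-differentiable renderer with a differentiable approximation, then tune scene attributes by gradient descent on an image-level content loss) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the "engine" is a quantizing function over eight pixels, the single attribute is a hypothetical light intensity, the "generative model" is a per-pixel least-squares line, and the content loss is plain mean squared error.

```python
import random

# Fixed scene content: 8 "pixels" (hypothetical stand-in for a rendered image).
PATTERN = [i / 7 for i in range(8)]

def engine_render(intensity):
    """Toy black-box engine. Quantization makes the output piecewise constant,
    so its gradient with respect to the attribute is zero almost everywhere."""
    return [round(intensity * p * 16) / 16 for p in PATTERN]

def fit_surrogate(samples):
    """Fit a differentiable surrogate: per-pixel line pixel ~ w * intensity + c,
    obtained by closed-form least squares on (attribute, image) samples."""
    xs = [x for x, _ in samples]
    mean_x = sum(xs) / len(xs)
    var_x = sum((x - mean_x) ** 2 for x in xs) / len(xs)
    params = []
    for i in range(len(PATTERN)):
        ys = [img[i] for _, img in samples]
        mean_y = sum(ys) / len(ys)
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)
        w = cov / var_x
        params.append((w, mean_y - w * mean_x))
    return params

def surrogate_render(params, intensity):
    """Differentiable approximation of engine_render."""
    return [w * intensity + c for w, c in params]

def content_loss(a, b):
    """Image-level loss: mean squared pixel difference (an assumed choice)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def optimize_attribute(params, target, start=0.2, lr=0.5, steps=200):
    """Gradient descent on the attribute, back-propagating through the surrogate."""
    theta = start
    for _ in range(steps):
        pred = surrogate_render(params, theta)
        # d(loss)/d(theta) for the linear surrogate, written out by hand.
        grad = sum(2 * (p - t) * w
                   for p, t, (w, _) in zip(pred, target, params)) / len(target)
        theta -= lr * grad
    return theta

# 1) Sample the engine to train the surrogate.
rng = random.Random(0)
samples = [(x, engine_render(x)) for x in (rng.uniform(0.0, 2.0) for _ in range(500))]
params = fit_surrogate(samples)

# 2) Recover the attribute behind a "real" target image.
true_intensity = 1.3
target = engine_render(true_intensity)
estimate = optimize_attribute(params, target)
print("estimated intensity:", round(estimate, 3))
```

In this sketch the engine itself gives no usable gradient (a tiny change in the attribute leaves the image unchanged), while the fitted surrogate lets gradient descent move the attribute toward the value that produced the target image. A real system would use a learned generative network and a perceptual content loss in place of these toy components.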

Similar resources

Trainable Speech Synthesis

This thesis is concerned with the synthesis of speech using trainable systems. The research it describes was conducted with two principal aims: to build a hidden Markov model (HMM) based speech synthesis system which could synthesise very high quality speech; and to ensure that all the parameters used by the system were obtained through training. The motivation behind the first of these aims was ...

Prior knowledge in an end-user trainable machine vision framework

The increasing popularity of machine vision based solutions in common applications calls for a structured approach for incorporating the end user’s domain knowledge and limiting the solution’s dependency on expert knowledge. We propose a framework facilitating optimized classification results and will show several approaches in which prior knowledge of the solution is captured in a neural netwo...

Trainable speech synthesis with trended hidden Markov models

In this paper we present a trainable speech synthesis system that uses the trended Hidden Markov Model to generate the trajectories of spectral features of synthesis units. The synthesis units are trained from a transcribed continuous speech corpus, making the speech more natural than that produced by conventional diphone synthesisers which are generally trained from a highly articulated speech...

Integration of Intonation in Trainable Speech Synthesis

Current developments in artificial speech synthesis place more emphasis on spectral continuities and diverse prosodic effects. The trainable HMM-based speech synthesis method has generated more continuous spectral structure than the unit selection method in recent studies, but the pitch contour generated by the HMM-based method tends to be over-smoothed and lacks syllable variance in Chinese. In this pa...

The IBM trainable speech synthesis system

The speech synthesis system described in this paper uses a set of speaker-dependent decision-tree state-clustered hidden Markov models to automatically generate a leaf level segmentation of a large single-speaker continuous-read-speech database. During synthesis, the phone sequence to be synthesised is converted to an acoustic leaf sequence by descending the HMM decision trees. Duration, energy...

Journal

Journal title: IET Image Processing

Year: 2022

ISSN: 1751-9659, 1751-9667

DOI: https://doi.org/10.1049/ipr2.12486